Three families of Bacillus cyclic lipopeptides (surfactins, iturins, and fengycins) have well-recognized potential uses in biotechnology
and biopharmaceutical applications. This study outlines the isolation and characterization of locillomycins, a novel
family of cyclic lipopeptides produced by Bacillus subtilis 916. Elucidation of the locillomycin structure revealed several molecu- 
lar features not observed in other Bacillus lipopeptides, including a unique nonapeptide sequence and macrocyclization. Locillo- 
mycins are active against bacteria and viruses. Biochemical analysis and gene deletion studies have supported the assignment of 
a 38-kb gene cluster as the locillomycin biosynthetic gene cluster. Interestingly, this gene cluster encodes 4 proteins (LocA, LocB, 
LocC, and LocD) that form a hexamodular nonribosomal peptide synthetase to biosynthesize cyclic nonapeptides. Genome analysis
and the chemical structures of the end products indicated that the biosynthetic pathway exhibits two distinct features: (i) a
nonlinear hexamodular assembly line, with three modules in the middle utilized twice and the first and last two modules used
only once, and (ii) several domains that are skipped or optionally selected.


In the competition for nutrients, members of the Bacillus genus
often produce a vast array of biologically active molecules that 
potentially inhibit the development of competing organisms. The 
Gram-positive bacterium Bacillus subtilis has an average of 4 to 5% 
of its genome devoted to antibiotic synthesis and is able to pro- 
duce more than two dozen antibiotics with an amazing variety of 
structures (1). Many of these compounds, which have a peptide 
origin, are synthesized either ribosomally or nonribosomally. 
Among the nonribosomally generated amphipathic cyclic lipo- 
peptides, surfactins, iturins, and fengycins have well-recognized 
potential applications in biotechnology and biopharmaceutical 
products due to their antagonistic activities and surfactant prop- 
erties (2, 3). Furthermore, the mechanisms behind the observed 
biocontrol efficacy of different Bacillus strains have also been well 
described (4-6). Lipopeptides are able to induce systemic resistance
in plants and to facilitate the multicellular behaviors of the
producing strains, such as swarming motility, biofilm formation, 
and colony morphology (5-7). 

Surfactins, iturins, and fengycins are synthesized by nonribo- 
somal peptide synthetases (NRPSs) which exhibit a distinct mod- 
ular architecture (2, 8-10). A module is typically composed of 
three core domains, with each domain responsible for a certain 
biochemical reaction (11). Specifically, the amino acid adenyla- 
tion domain (A domain) controls the entry of substrates into the 
peptide structure by recognizing and activating a specific amino 
acid. The thiolation domain (T domain), also referred to as the 
peptidyl carrier protein (PCP), contains an invariant serine resi- 
due which is essential for the binding of a 4'-phosphopantetheine 
cofactor. The N-terminal condensation domain (C domain) is 
required for the coupling of two consecutively bound amino acids 
(12, 13). These three domains constitute a minimal elongation 
module, the basic repetitive unit of a multimodular NRPS. Fur- 
thermore, modules can be supplemented with domains that cata- 
lyze modifications of the activated amino acid, such as N-methyl- 
ation and epimerization. In some cases, when the first module of 
an NRPS complex lacks a C domain, the last module contains a 
termination thioesterase domain (TE domain) to release the end
product (14). The order and specificity of the modules within the 
protein template determine the sequence of the product (for type 
A, linear NRPSs) (8, 11). Genetic and biochemical analyses have 
revealed that the modular arrangement of most lipopeptide syn- 
thetases is colinear with the amino acid sequences of lipopeptides 
(1, 2). This assembly line arrangement of the conserved catalytic 
modules and domains provides the means to construct hybrid 
NRPSs for use in the synthesis of new lipopeptide compounds 
(15-18). The prospect of creating numerous bioactive lipopep- 
tides by engineering existing lipopeptide synthetases has stimu- 
lated the search for new NRPSs responsible for lipopeptide syn- 
thesis (19-24). 

To date, only two reported kinds of biosynthetic machinery 
within the NRPS assembly line do not conform to the rule of 
colinearity. These include the type B and type C NRPSs, which 
iteratively use all of their modules and certain domains, respec- 
tively, during the assembly of a product (25-28). While our un- 
derstanding of the nonlinear NRPS biosynthetic mechanism is 
limited, it is clear that this mechanism has great potential to introduce
structural diversity to secondary metabolites through
combinatorial biosynthesis (25, 29, 30). It is necessary to propose
nonlinear NRPS biosynthetic models that can further unravel the
details of this biosynthetic mechanism. The Gram-positive bacterium
B. subtilis serves as a model organism and is intensively used
in the heterologous expression of commercial metabolites (30-32).
However, apart from the present study, no nonlinear NRPS
assembly line has been observed in this well-characterized species.

TABLE 1 Primers used in this study

Primer   Size (nt)  Sequence (5'-3')                   Note
LocDF    30         TTTGCATGCATGAACTATGATTTATCACAT     Used for ΔlocD mutant construction
LocDR    29         TTTGAGCTCAGTAAAAATGAGAGGCAATT      Used for ΔlocD mutant construction
LocCF    30         TTTAAGCTTAATTCAGCTCTTTATGAAGAA     Used for ΔlocC mutant construction
LocCR    30         TTTGAGCTCATGAAGTTTACTAATGAATAC     Used for ΔlocC mutant construction
Orf1UF   30         TTTAAGCTTGAATAAATAATTCACGGTAAA     Used for Δorf1 mutant construction
Orf1UR   30         TTTTCTAGAATTCAGCTGCTTTATCGTAAG     Used for Δorf1 mutant construction
Orf1DF   30         TTTTCTAGACTGTTGCTTTGTCGCATAATG     Used for Δorf1 mutant construction
Orf1DR   30         TTTGCATGCAAGAGTGAGTTATCCAGTTGA     Used for Δorf1 mutant construction
Orf3F    30         TTTAAGCTTCATTGAACTGAATAAAATGTA     Used for Δorf3 mutant construction
Orf3R    30         TTTGAGCTCTTAGTCTCAATCTCAATGTTT     Used for Δorf3 mutant construction
Orf7F    30         TTTGCATGCATACGATTAATAAAAGATATG     Used for Δorf7 mutant construction
Orf7R    30         TTTGAATTCTTAAATTGTTATATCATCTTT     Used for Δorf7 mutant construction
LocAA1F  32         TTTCCATGGTATCTGAAATAGAAATGATTACG   Used for A1 domain expression in E. coli
LocAA1R  30         TTTCTCGAGTTCTTTCTGAATAGCTGTTTG     Used for A1 domain expression in E. coli
LocBA2F  33         TTTCCATGGTAAAAGATGTAGAAATTATTACAG  Used for A2 domain expression in E. coli
LocBA2R  33         TTTCTCGAGCATTTGTTCTTTTTTATTATTTGG  Used for A2 domain expression in E. coli
LocBA3F  29         TTTCCATGGTTTCGGAGATTGATATCACG      Used for A3 domain expression in E. coli
LocBA3R  29         TTTCTCGAGGGTTTCCTGAACATCATTCT      Used for A3 domain expression in E. coli
LocBA4F  31         TTTGGATCCATCTCTTCTATAGATATCATGA    Used for A4 domain expression in E. coli
LocBA4R  32         TTTCTCGAGTTATGTTTCCTGCAAAACTGTTT   Used for A4 domain expression in E. coli
LocCA5F  28         TTTCCATGGGAGATGTAGGTTTGCTGAC       Used for A5 domain expression in E. coli
LocCA5R  33         TTTCTCGAGAACTAATTCTTTTTCAATGTTATT  Used for A5 domain expression in E. coli
LocCA6F  32         AAACCATGGTATATCAAATTAACATGATGACT   Used for A6 domain expression in E. coli
LocCA6R  30         AAACTCGAGTTTTGTCTCTGTTTCGTTTCT     Used for A6 domain expression in E. coli
PEDF     24         CGCAAAGACTGAACCCACTAATTT           Used for quantitative analysis of PEDV
PEDR     24         TTGCCTCTGTTGTTACTTGGAGAT           Used for quantitative analysis of PEDV

The commercial strain B. subtilis 916 was isolated from paddy 
soils in Jurong County, Jiangsu, China, and has been reported to 
be effective in the biocontrol of plant diseases (33). While NRPSs 
for lipopeptide production in B. subtilis 916 were reported recently
and the locillomycins were also described briefly (34), in the present
study we describe in detail the novel structure and unique
biosynthesis of the new locillomycin family of nonribosomal lipo- 
peptides and its biological function. In this study, our results 
strongly suggest that locillomycins have molecular features not 
observed in the three families of surfactins, iturins, and fengycins, 
including a unique nonapeptide sequence and macrocyclization. 
Locillomycins are active against bacteria and viruses, with low 
hemolytic activity, and can be used in natural health products for 
therapeutic applications. While the end products were straightforward,
the proposed biosynthetic pathway of locillomycins contained
several atypical aspects that were not predictable by bioinformatic
analysis. Our study thus adds to the knowledge regarding
nonlinear NRPS biosynthesis mechanisms used by B. subtilis and
further broadens the potential of strain 916 as a model system for
nonlinear NRPS studies.

FIG 1 Organization of the locillomycin gene cluster and surrounding genes on the B. subtilis 916 chromosome. The same loci of B. subtilis 168 and B.
amyloliquefaciens FZB42 are also shown for comparison. Genes with high degrees of homology between the three strains (>95% identity) are shown in yellow. The
biosynthetic clusters for locillomycins and sporulation killing factor are shown in blue and purple, respectively. Other, unrelated genes are also shown in green,
red, and orange.

6602 aem.asm.org Applied and Environmental Microbiology October 2015 Volume 81 Number 19


MATERIALS AND METHODS 


Cultivation, isolation, and purification. Luria-Bertani (LB) broth was 
inoculated with B. subtilis 916 to a concentration of 5.5 X 10° cells/ml. 
One hundred cultures of 330 ml each in 2-liter Erlenmeyer flasks were 
incubated with agitation at 180 rpm at 28°C for 3 days. After removing the 
biomass by centrifugation, the broth was titrated to pH 2.8 with concen- 
trated hydrochloric acid, and the resulting gray precipitate was extracted 
with 150 ml methanol (MeOH). The MeOH extract was added to 300 ml 
H2O and titrated to pH 7 with 5 M NaOH. The extracted impurities were
added to an Agilent amino solid-phase extraction column and sequentially
washed with 50% (vol/vol) MeOH-H2O, 100% MeOH, 1% (vol/vol)
formic acid-MeOH, and 2% (vol/vol) formic acid-MeOH. The 2% (vol/vol)
formic acid-MeOH eluate was concentrated by nitrogen drying to 10
mg/ml, loaded on an Agilent C18 solid-phase extraction column, and
sequentially washed with 40, 42, 46, and 48% (vol/vol) MeOH-H2O. The 44,
46, and 48% (vol/vol) MeOH-H2O eluates were concentrated to 20 mg/ml
by nitrogen drying. These fractions were further processed by high-pressure
liquid chromatography (HPLC), using repetitive 50-μl injections, a
reversed-phase column (RP-18; 5 μm by 4 mm by 250 mm; Merck), and
a 0.5-ml/min flow rate, and were monitored at 230 nm. Fraction I yielded
derivative A, which was collected at an average peak retention time of 9
min; fraction II yielded derivative B, with a retention time of 13 min; and
fraction III yielded derivative C, with a retention time of 18 min. The
fractions were eluted with 50% (vol/vol) CH3CN, 0.5% (vol/vol) trifluoroacetic
acid in H2O. All collections were concentrated by nitrogen and
vacuum drying and resulted in the purified locillomycin A, B, and C
derivatives.

NMR structure determination. All nuclear magnetic resonance
(NMR) spectra were acquired at 25°C on a Bruker Avance II 600-MHz,
Agilent DD2 500-MHz (for 13C-detected spectra), or Agilent DD2 600-MHz
spectrometer equipped with a cold probe. Locillomycin A (~8 mg)
was dissolved in 500 μl aqueous solvent (20 mM phosphate buffer, pH 6.5,
D2O-H2O [9:1 {vol/vol}]), while ~8-mg samples of locillomycins B and C
were dissolved in 500 μl CD3OH to make the NMR samples. For each
sample, a series of one-dimensional (1D) and 2D spectra, including spectra
for 1H, 13C, distortionless enhancement by polarization transfer
(DEPT), 1H-1H correlation spectroscopy (COSY), 1H-1H total correlation
spectroscopy (TOCSY), 1H-1H rotating-frame nuclear Overhauser
effect spectroscopy (ROESY), 1H-13C heteronuclear single quantum coherence
(HSQC), and 1H-13C heteronuclear multiple-bond correlation
(HMBC), were acquired for structure elucidation. The spin-lock time for
each TOCSY experiment was 80 ms, and the mixing time for each ROESY
experiment was 200 ms. All NMR data were processed and analyzed with
MestReNova. The chemical shifts in the aqueous solvent were referenced
to the single-deuterium hydrogen oxide (HOD) peak at 4.77 ppm, while
those in CD3OH were referenced to the peak of the methyl group, at 3.30
ppm. Chemical shifts of 13C were referenced indirectly (35) or to the peak
of CD3OH, at 49 ppm. Sequential assignment for the peptide part was
done using the protocol developed by Wüthrich (36).

Gene disruption of B. subtilis 916 and mutant analysis. In vivo gen- 
eration of targeted mutations in B. subtilis 916 was achieved by a modified 
protocol originally developed for B. subtilis 168 (9). The knockout plasmids
were constructed from vector pUC19 by inserting homologous fragments
from B. subtilis 916 genomic DNA and a neomycin resistance cassette
from pBEST501. An example for disruption of locD (ΔlocD) is
detailed below. A 2.5-kb fragment was amplified by a PCR using the prim- 
ers LocDF and LocDR and cloned into vector pUC19, generating pLocD1. 
The latter was digested with XbaI, which cuts in the middle of the PCR 
fragment. Simultaneously, pBEST501 was cut with the same enzyme to 
obtain the neomycin resistance cassette (~1.3 kb), which was then ligated 
to pLocD1, resulting in pLocD2. This was subsequently transformed into
the naturally competent strain B. subtilis 916, in which it was introduced
into the genome via double-crossover homologous recombination. The
disruption of the locD gene in neomycin-resistant colonies was confirmed
by PCR with appropriate primers and by Southern hybridization. The
locillomycins were extracted from the mutants and further analyzed by
HPLC-mass spectrometry (HPLC-MS) as described above.

FIG 2 HPLC-MS chromatograms for locillomycins produced by wild-type B.
subtilis 916 and its ΔlocD mutant. (A) Locillomycins A to C were detected in B.
subtilis 916 broth cultures but were below the detectable level in ΔlocD cultures.
(B) MS spectra for extracts of the ΔlocD mutant (a to c) and wild-type B.
subtilis 916 (d to f). Molecular weights of locillomycins A, B, and C were
determined for B. subtilis 916 broth cultures, but these compounds were not
observed in the ΔlocD mutant culture broth.

Heterologous expression and purification of internal adenylation
domains. The pET28a expression system was used to clone PCR products
in which NcoI and XhoI sites were introduced by use of 5'-modified
primers (Table 1). PCR amplification from the B. subtilis 916 chromosome
was carried out in a 50-μl reaction volume containing 2.5 U ExTaq
DNA polymerase, 5 μl 10× Mg-free reaction buffer, a 200 μM concentration
of each deoxynucleoside triphosphate (dNTP), 2.5 mM MgCl2,
and 1 μM (each) primers. Amplification conditions were as follows: 30
cycles of 95°C for 30 s, 55°C for 30 s, and 72°C for 3 min, followed by a final
extension at 72°C for 10 min. PCR products were digested with NcoI and
XhoI and cloned into the digested pET28a vector. PCR products cloned
into the XhoI site of pET28a resulted in appendage of a poly-His tag at the
carboxyl ends of recombinant proteins. Transformants of Escherichia coli
BL21(DE3) were selected by growth at 28°C in LB medium containing 100
μg/ml ampicillin. Cells were induced with 1 mM isopropyl-β-D-thiogalactopyranoside
(IPTG) at an optical density at 600 nm (OD600) of 0.6 and
allowed to grow for an additional 4 h before being harvested. Purification
of the His6-tagged proteins was carried out by Ni2+-affinity chromatography.
The proteins were separated by 10% (wt/vol) SDS-PAGE, stained
with Coomassie blue, and analyzed by density scanning using an image
analysis system (Bio-Rad). 
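
The cloning strategy above relies on restriction sites carried at the 5' ends of the primers. As a minimal sketch, the following Python snippet checks a few of the A-domain primers from Table 1 for their stated lengths and for the recognition sequences of NcoI (CCATGG), XhoI (CTCGAG), and BamHI (GGATCC). The function and variable names are illustrative only and are not part of the original methods; only primers whose sequences are unambiguous in Table 1 are included.

```python
# Recognition sequences of the restriction enzymes considered here.
RESTRICTION_SITES = {"NcoI": "CCATGG", "XhoI": "CTCGAG", "BamHI": "GGATCC"}

# (primer name, stated size in nt, sequence 5'-3'), copied from Table 1.
PRIMERS = [
    ("LocBA4F", 31, "TTTGGATCCATCTCTTCTATAGATATCATGA"),
    ("LocBA4R", 32, "TTTCTCGAGTTATGTTTCCTGCAAAACTGTTT"),
    ("LocCA5F", 28, "TTTCCATGGGAGATGTAGGTTTGCTGAC"),
    ("LocCA5R", 33, "TTTCTCGAGAACTAATTCTTTTTCAATGTTATT"),
    ("LocCA6F", 32, "AAACCATGGTATATCAAATTAACATGATGACT"),
    ("LocCA6R", 30, "AAACTCGAGTTTTGTCTCTGTTTCGTTTCT"),
]


def sites_in(sequence):
    """Return the names of all listed enzymes whose site occurs in the primer."""
    return [name for name, site in RESTRICTION_SITES.items() if site in sequence]


def check_primers(primers):
    """Compare each primer with its stated size and list the sites it carries."""
    report = {}
    for name, stated_size, sequence in primers:
        report[name] = {
            "length_ok": len(sequence) == stated_size,
            "sites": sites_in(sequence),
        }
    return report


if __name__ == "__main__":
    for name, result in check_primers(PRIMERS).items():
        print(name, result)
```

In this subset, the reverse primers carry the XhoI sequence and the A5 and A6 forward primers carry the NcoI sequence, whereas the A4 forward primer (LocBA4F) carries a GGATCC (BamHI) sequence instead.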

FIG 3 Important nuclear Overhauser effect (NOE) and HMBC interactions for determination of the attachment site for the acyl group and the cyclic structure
of the peptide moiety are illustrated in the structure (top right) and the spectra (top left and bottom). The ROESY (top left) and HMBC (bottom) spectra are also
labeled with assigned signals along the sides. The assigned nucleic names of the key residues are underlined and displayed in red, and the red circles show relevant
NOE and HMBC interactions.

Pyrophosphate exchange assay. Amino acid-dependent ATP-sodium
pyrophosphate assays were performed by using the spectrophotometric
assay furnished by an EnzChek pyrophosphate assay kit (Molecular
Probes). Each 200-μl reaction mixture contained 75 mM Tris-HCl
(pH 7.5), 10 mM MgCl2, 5 mM dithiothreitol (DTT), 5 mM ATP, 400 mM
MesG, 0.2 U purine nucleoside phosphorylase, 0.2 U inorganic pyrophosphatase,
and 2 mM amino acids. A reaction mixture without amino acids
was used as the control. The reaction was initiated by the addition of
enzyme, and the mixture was incubated at 30°C for 30 min. The absorbance
at 360 nm was measured in a Perkin-Elmer Lambda 6-vis spectrophotometer.
All reactions were performed in triplicate.
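
The relative activities reported later for each A domain (highest activity set at 100%) can be derived from triplicate absorbance readings in a few steps. The sketch below is one plausible way to do this, assuming that the no-amino-acid control is subtracted as background before scaling; the paper does not state the exact calculation, and the absorbance values shown are hypothetical numbers chosen only to illustrate the arithmetic.

```python
from statistics import mean


def relative_activities(readings, control):
    """Background-subtract triplicate readings and scale the best substrate to 100%.

    readings: dict mapping amino acid name -> list of A360 values
    control:  list of A360 values for the reaction without amino acids
    """
    baseline = mean(control)
    # Net signal per substrate; negative values are clipped to zero.
    net = {aa: max(mean(values) - baseline, 0.0) for aa, values in readings.items()}
    top = max(net.values())
    if top <= 0:
        raise ValueError("no substrate gave a signal above the control")
    return {aa: round(100.0 * value / top, 1) for aa, value in net.items()}


# Hypothetical A360 readings (not measured data).
control = [0.10, 0.10, 0.10]
readings = {
    "Gln": [0.60, 0.60, 0.60],
    "Asn": [0.50, 0.50, 0.50],
    "Asp": [0.15, 0.15, 0.15],
}

if __name__ == "__main__":
    print(relative_activities(readings, control))
```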

Nucleotide sequence accession numbers. The complete genome se- 
quence of B. subtilis 916 and the sequence of the gene cluster for biosyn- 
thesis of locillomycins described in this work have been submitted to 
GenBank under accession numbers CP009611 and KF866134, respec- 
tively. 


RESULTS AND DISCUSSION 

Identification and in vivo gene disruption of loc gene cluster. 
Identification and bioinformatic analyses of the loc gene cluster 
were employed as described in our previous work (34). The draft 
genome and complete genome sequences of B. subtilis 916 were 
recently analyzed (GenBank accession no. AFSU00000000.1 and 
CP009611), and four NRPS gene clusters were identified (see Fig. 
S1 in the supplemental material) (37). In addition to the three 
conventional gene clusters, srf (for surfactins), bmy (for bacillo- 
mycin Ls), and fen (for fengycins), B. subtilis 916 also contains a 
fourth NRPS gene cluster, loc (GenBank accession no. KF866134) 
(34). Further analysis of the loc open reading frames led to the 
conclusion that the loc gene cluster was a potential candidate for
the biosynthesis of an unknown family of lipopeptides called locil- 
lomycins. Interestingly, this particular locus in B. subtilis 916 
shares a high degree of homology (>95% identity) with the same
locus found in B. subtilis 168 and Bacillus amyloliquefaciens FZB42
(Fig. 1). 

Four NRPS genes, designated locD, -A, -B, and -C, are contiguous
in the loc gene cluster and correspond to six peptide extension
modules (Fig. 1). The first gene, locD, encodes a 145.8-kDa
protein containing the polyketide synthase module and including 
domains with homology to the proteins involved in the synthe- 
sis of fatty acids and polyketides: a fatty acid acyl-coenzyme A 
synthetase (ACS) domain, an acyl carrier protein/thiolation 
domain (T domain), and a β-ketoacyl synthetase (KAS) domain.
The second gene, locA, encodes a 267.4-kDa protein with
only one extension module, containing the core elongation 
domains: the C, A, and T domains. Two C domains exist inde- 
pendently upstream of the A domain. The third gene, locB, 
encodes a 443.1-kDa protein containing three extension mod- 
ules. The first module contains a predicted epimerization (E) 
domain, suggesting that the corresponding amino acid appears 
in the D-configuration. The fourth gene, locC, encodes a 243.9-kDa
protein containing two extension modules and a termination
module, which contains a C-terminal TE domain. A similarity
analysis showed that the nucleotide sequence of the loc
gene cluster has low similarity (<50%) to other lipopeptide
NRPS gene clusters (see Table S1 in the supplemental material).
In addition, the combination and organization of the
enzymatic components of loc are uniquely structured (Fig. 1).

FIG 4 Structural comparison of locillomycins, surfactins, iturins, and fengycins. Some of the key bonds are colored red and blue for clarity. The residue numbers
are labeled next to the α-carbons. Statistical sequences and the configuration for each residue are listed below the formulas.

It was previously shown that the amino acid specificities of the 
individual modules can be predicted by comparing the active site 
residues of known NRPS A domains (see Table S2 in the supple- 
mental material) (38). According to the colinearity rule, the locil- 
lomycins biosynthesized by the loc gene cluster may be composed 
of a long fatty acid chain of 13 to 18 carbons linked to a hexapep- 
tide moiety, with molecular masses ranging from 800 to 1,000 Da. 

A set of gene disruption experiments were carried out as pre- 
viously described to verify the identified NRPS loc gene cluster and 
to test the necessity for various genes in the biosynthesis of locil- 
lomycins. The knockout targets included locD (encoding a 
polyketide synthase module), locC (including the coding sequence 
for a termination module), orf1 (encoding an upstream ABC
transporter protein), orf3 (encoding a putative regulatory protein 
containing the sensor histidine kinase), and orf7 (encoding a pu- 
tative multidrug transporter protein). The primers used in this 
study are listed in Table 1. The production of locillomycins was 
completely abolished in the ΔlocD and ΔlocC strains, which demonstrated
that these two genes were essential for the biosynthesis
of locillomycins. The disruption of orf3 also significantly reduced 
the yield of locillomycins compared to that of the wild type (re- 
duced to <5%), with only trace amounts of locillomycins as indi- 
cated by mass ion extraction. This result suggested that orf3 plays 
an important role in the regulation of locillomycin biosynthesis. 
Contrary to our expectations, the disruption of orf1 and orf7 had
no apparent impact on the production of locillomycins (see Fig. 
S3 in the supplemental material). These results suggest that a va- 
riety of transporter proteins may be involved in the transportation 
of locillomycins. In general, the in vivo gene disruption experiments
suggested that the identified loc gene cluster isolated from
B. subtilis 916 is responsible for the biosynthesis of locillomycins. 

Given the obvious differences in molecular masses of locillo- 
mycins detected by HPLC-MS (1,145.6, 1,159.6, and 1,173.6 Da) 
(Fig. 2) and the molecular masses predicted by the bioinformatic 
analysis (800 to 1,000 Da), the results strongly suggested that the 
biosynthesis of locillomycins in B. subtilis 916 does not obey the 
colinearity rule. To further elucidate the mechanism of biosynthesis
of locillomycins, it was necessary to characterize the structures
of locillomycins and to detail the six internal adenylation domains 
encoded by the loc gene cluster. 

Isolation and structural characterization of locillomycins. 
The procedures for cultivation, isolation, and purification of locil- 
lomycins are described in detail in Materials and Methods. Briefly, 
the purification method consisted of organic solvent extraction, 
amino solid-phase chromatography, C18 solid-phase chromatography,
and repetitive HPLC purification (see Fig. S4 in the supplemental
material). Detailed NMR and MS approaches were pursued
simultaneously with the gene disruption analysis work to
determine the chemical structures of the locillomycins. Locillo- 
mycin C was used as a representative example to explain the struc- 
tural characterization of all locillomycins (Fig. 3). The amino acid 
residue types and connectivity were obtained by analyzing the 
ROESY and TOCSY spectra (36). We concluded that there are 
nine residues in locillomycin C and that these residues are con- 
nected in the sequence Thr1-Gln2-Asp3-Gly4-Asn5-Asp6-Gly7- 
Tyr8-Val9. These observations were consistent with the data from 
the tandem MS analysis (see Table S5 and Fig. S8). Several Thr1- 
Val9 interactions observed in the ROESY spectrum indicate that 
these two residues are close in space, implying the possibility of a 
cyclic structure connected end to end. The existence of a long alkyl
group is evident from the large peaks around 1.2 to 1.4 ppm in the
1H spectrum and corresponding peaks in the 13C spectrum. Protonated
carbons were assigned after the assignment of most protons,
with the aid of the 13C-edited (CH2 negative, CH3 and CH
positive) HSQC spectrum. All known fragments of the compound
accounted for 13 carbonyl signals, while there were 14 carbonyl
signals in the 13C spectrum. This indicated that there must be a
carbonyl group in the long alkyl chain of this compound. By analyzing
both the HMBC and ROESY spectra, the attachment site
between the carbonyl alkyl chain (actually a long-chain acyl
group) and the peptide moiety was determined. Moreover, both
spectra confirmed that the peptide chain was indeed cyclic in
nature, through the end-to-end connection of Thr1 and Val9 (Fig.
3). Finally, the number of CH2 groups was determined by integrating
the multi-CH2 area, and the result is in accordance with
the results of mass spectrum analysis. No fragment from the alkyl 
group was identified by MS, and neither the DEPT nor proton 
spectrum indicated that there was branching in this alkyl group. 
Stereochemistry analysis of the amino acids was performed by 
HPLC after complete degradation of the compounds (see Fig. S9). 
Only one of the nine amino acids was found to be in the D-con- 
figuration (D-Gln). The resulting chemical structure is shown in 
Fig. 4. As for locillomycins A and B, analysis of NMR and MS 
spectra showed that they were very similar to locillomycin C, and 
there was a difference in molecular mass of only 14 Da between the 
different locillomycins (C > B > A) (see Tables S3 to S5 and Fig. 
S6 to S8). This difference is derived from the different lengths of 
the long-chain acyl group shown in Fig. S10, and the method of
structural determination is not detailed here. 
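
The 14-Da spacing between locillomycins A, B, and C corresponds to one CH2 unit, and the reported masses can also be used for a rough estimate of the acyl chain length. The following Python sketch illustrates this under several assumptions that are ours and not stated in the text: the HPLC-MS values (1,145.6, 1,159.6, and 1,173.6 Da) are treated as neutral monoisotopic masses, the acyl group is taken to derive from a saturated, unbranched fatty acid (CnH2nO2), and standard monoisotopic residue masses are used. For a cyclic lipopeptide in which the fatty acid is amide linked and the ring is closed by a lactone, the total mass is the sum of the residue masses plus the fatty acid mass minus one water.

```python
# Standard monoisotopic residue masses (Da) for the amino acids in locillomycin.
RESIDUE_MASS = {
    "Gly": 57.02146, "Val": 99.06841, "Thr": 101.04768, "Asn": 114.04293,
    "Asp": 115.02694, "Gln": 128.05858, "Tyr": 163.06333,
}
# Peptide sequence of locillomycin C as determined by NMR.
SEQUENCE = ["Thr", "Gln", "Asp", "Gly", "Asn", "Asp", "Gly", "Tyr", "Val"]
# Masses reported by HPLC-MS, treated here as neutral monoisotopic masses.
OBSERVED = {"A": 1145.6, "B": 1159.6, "C": 1173.6}

H, C, O = 1.007825, 12.0, 15.994915
CH2 = C + 2 * H      # about 14.016 Da
WATER = 2 * H + O    # about 18.011 Da


def mass_steps(masses):
    """Mass differences between consecutive derivatives, sorted by mass."""
    ordered = sorted(masses.values())
    return [round(b - a, 1) for a, b in zip(ordered, ordered[1:])]


def acyl_carbons(total_mass, sequence=SEQUENCE):
    """Estimate n for a saturated fatty acid CnH2nO2 attached to the cyclic peptide.

    total = sum(residues) + fatty acid - H2O, so the fatty acid mass is
    total - sum(residues) + H2O, and n = (fatty acid - 2 O) / CH2.
    """
    residues = sum(RESIDUE_MASS[r] for r in sequence)
    fatty_acid = total_mass - residues + WATER
    return (fatty_acid - 2 * O) / CH2


if __name__ == "__main__":
    print("mass steps (Da):", mass_steps(OBSERVED))
    for name, mass in OBSERVED.items():
        print(f"locillomycin {name}: about C{round(acyl_carbons(mass))} acyl chain")
```

Under these assumptions, the estimate gives acyl chains of 13, 14, and 15 carbons for locillomycins A, B, and C, respectively.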

Unlike the surfactin and iturin lipopeptide families, in which
the peptide residues are interlinked with a β-hydroxy or β-amino
fatty acid to form a cyclic lactone or lactam, the locillomycins form
an internal lactone ring within the peptidic moiety. This construction
is more similar to that of the fengycin family, which also
forms a lactone without the participation of a β-hydroxyl fatty
acid. Importantly, however, the number of residues in locillomycins
is 9 rather than the 10 observed in the fengycin family (8 of 10
residues form the lactone ring) (Fig. 4). These nine-membered
cyclic lipopeptides are also different from the iturin and surfactin
families, which form seven-membered rings. Furthermore, only
one D-amino acid (D-Gln2) is incorporated into the members of
the locillomycin family, while all the members of the other three
families possess more than two D-amino acid residues. The lengths
of the fatty acid chains are similar to those of other Bacillus
lipopeptides.

FIG 5 Expression, purification, and relative substrate specificities of internal
adenylation domains (A domains). (A) Expression and purification of the A
domains were analyzed in 10% SDS-polyacrylamide slab gels. (B) A domains
were investigated in terms of activity in the ATP-PPi exchange reaction, using
the amino acids of locillomycin and a control without amino acids. The
highest activities were set at 100%. The background was below 5%. The
specificities of the different domains coincide with the primary structures of the
locillomycins.

Biochemical investigation of internal adenylation domains. 
To further elucidate the biosynthetic process for locillomycins, 
DNA fragments encoding the adenylation domains of modules 
LocA1 (A1), LocB1 (A2), LocB2 (A3), LocB3 (A4), LocC1 (A5), 
and LocC2 (A6) (see Fig. 6) were amplified from chromosomal B. 
subtilis 916 DNA and cloned into an IPTG-inducible expression 
vector as described in Materials and Methods. The constructs were 
confirmed by sequencing of the fusion sites. The enzyme fragments
were overexpressed in E. coli as His6-tagged proteins and
were purified by Ni2+-affinity chromatography. All proteins were
found within the soluble fraction after French press lysis of E. coli
cells, as confirmed in Coomassie blue-stained SDS-polyacrylamide
gels (Fig. 5A), with calculated molecular masses of 60.0 to
65.0 kDa. The purified adenylation domains were biochemically
investigated with respect to their activity and specificity in the
ATP-PPi exchange reaction. In order to determine amino acid
specificity, each protein was incubated with all constituent amino
acids of locillomycins (the control was incubated without amino
acids). The six adenylation domains were found to activate the
amino acids corresponding to the constituent amino acids of
locillomycins. Specifically, A1 exclusively activated L-Thr; A2 activated
L-Gln (100%), L-Asn (80%), and L-Asp (10%); A3 activated L-Asp
(100%), L-Asn (15%), and L-Gln (10%); A4 exclusively activated
Gly; A5 activated L-Tyr (100%) and L-Val (15%); and A6 activated
L-Val (100%) and L-Asn (10%) (the highest activity was set at
100%) (Fig. 5B). A1, the only adenylation domain of LocA, exclusively
activated L-Thr and is likely to be involved in position 1
amino acid biosynthesis. A2, A3, and A4, the three adenylation
domains of LocB, are likely to act iteratively to incorporate the 2nd
to 7th amino acids. In particular, A2 was able to efficiently activate
both L-Gln and L-Asn. A5 and A6, the last two adenylation domains
of LocC, optimally activated L-Tyr and L-Val, respectively,
which is in agreement with the last two amino acids of
locillomycins.

FIG 6 Proposed biosynthetic pathway for locillomycins. Nonribosomal peptide synthetase genes (locA, locB, and locC) are represented by arrows, and domains
encoded by the respective genes are shown underneath; the domains encoded by locB are proposed to act iteratively. The fatty acid chain (FA) is coupled to the
peptide by formation of an amide bond with the amino group of the threonine residue, and a nonapeptide is cyclized by formation of a macrolactone between
the threonine hydroxyl and the valine carboxylate. The KAS domain is skipped during the synthesis and is labeled with a dashed box, while domains used
optionally are labeled with purple boxes. locB appears twice because it is used iteratively and the A2 domain exhibits different specificities in the two rounds. The
E domain following the A2 domain also loses its function in the second round.
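
The relationship between the measured A-domain specificities and the nonapeptide sequence can be made explicit with a small model. The Python sketch below compares the product expected under strict colinearity (each A domain used once, best substrate only) with the product of an assembly line in which the three LocB domains (A2, A3, and A4) are used twice. The choice that A2 incorporates its second-best substrate (L-Asn) in the second round is taken from the proposed pathway and is encoded here as an explicit modeling assumption; the function names are illustrative.

```python
# Relative ATP-PPi exchange activities (%) reported for each A domain.
ACTIVITY = {
    "A1": {"Thr": 100},
    "A2": {"Gln": 100, "Asn": 80, "Asp": 10},
    "A3": {"Asp": 100, "Asn": 15, "Gln": 10},
    "A4": {"Gly": 100},
    "A5": {"Tyr": 100, "Val": 15},
    "A6": {"Val": 100, "Asn": 10},
}

# Residue order of locillomycin C determined by NMR and tandem MS.
OBSERVED = ["Thr", "Gln", "Asp", "Gly", "Asn", "Asp", "Gly", "Tyr", "Val"]


def ranked(domain):
    """Substrates of one A domain, from most to least activated."""
    scores = ACTIVITY[domain]
    return sorted(scores, key=scores.get, reverse=True)


def colinear_product():
    """Strict colinearity: six modules, each used once with its best substrate."""
    return [ranked(domain)[0] for domain in ACTIVITY]


def iterative_product():
    """LocB (A2-A3-A4) used twice; A2 takes its second-best substrate in round 2.

    The round-2 choice for A2 is an assumption based on the proposed pathway.
    """
    first_round = [ranked(d)[0] for d in ("A2", "A3", "A4")]
    second_round = [ranked("A2")[1], ranked("A3")[0], ranked("A4")[0]]
    return (
        [ranked("A1")[0]] + first_round + second_round
        + [ranked("A5")[0], ranked("A6")[0]]
    )


if __name__ == "__main__":
    print("colinear :", "-".join(colinear_product()))
    print("iterative:", "-".join(iterative_product()))
    print("matches observed sequence:", iterative_product() == OBSERVED)
```

The colinear model yields only a hexapeptide, whereas the iterative model reproduces the observed nine-residue sequence.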

Proposed locillomycin biosynthesis pathway. The gene clus- 
ter for locillomycin biosynthesis identified in the present study 
sets the stage for further delineating the intricate chemical assem- 
bly of this family of lipopeptide antibiotics. Based on the chemical 
structures of locillomycins, the analysis of in vivo gene disrup- 
tions, internal adenylation domain experiments, the amino acid 
sequence homology of loc-encoded proteins to enzymes, and pre- 
vious knowledge of lipopeptide biosynthetic pathways, we pro- 
pose a model for the locillomycin biosynthetic pathway (Fig. 6). In 
brief, it is a pathway where a cyclic lipononapeptide is assembled 
by multifunctional enzymes of a hexamodular NRPS. In the initial 
step, the ACS domain couples coenzyme A to a long-chain fatty 
acid, presumably a tridecanoic to pentadecanoic acid, in an ATP- 
dependent reaction. The activated fatty acid is then transferred to 
the 4'-phosphopantetheine cofactor of the T domain. In the 
second step, the fatty acid is coupled directly to the activated 
threonine, a reaction catalyzed by one of the condensation domains preceding 
the A1 domain of the peptide synthetases, with the KAS domain, 
two T domains, and one C domain skipped (Fig. 6). In subsequent 
condensation reactions, the donor part, containing the fatty acid 
and thiolated threonine, is connected to the acceptor part, con- 
taining the A2 domain-activated glutamine, which is epimerized 
by the E domain following A2 before the next extension step (Fig. 
6). The process continues in a canonical fashion until the A4 do- 
main-activated glycine is connected. When this occurs, the whole 
LocB peptide is used a second time for further extension (Fig. 6). 
This time A2 activates asparagine rather than glutamine, which is 
connected to the donor without epimerization. Extension then 
proceeds through LocB and LocC all the way down to the TE 
domain, which catalyzes the cyclization of the synthesized linear 
peptide by connecting the carbonyl group of Val9 and the hy- 
droxyl group of the Thr1 side chain before releasing the mature 
cyclic lipopeptide of locillomycins (Fig. 6) (39). One should notice 
in particular that the first A domain (A1) and the last two A do- 
mains (A5 and A6) are used only once, whereas the three A do- 
mains in the middle (A2, A3, and A4) are used iteratively during 
the biosynthesis (Fig. 6). Moreover, the A2 domain in LocB 
exhibits a different specificity in the second iteration, by a 
mechanism that remains unclear. Also, the epimerization 
domain following the A2 domain, which is responsible 
for conversion of L-Gln to D-Gln the first time, is skipped the 
second time (Fig. 6). The different selectivities of these catalytic 
domains within the same NRPS assembly line are quite unusual 
compared to other NRPS assembly lines. Thus, compared to the 
iterative uses in type B and C NRPSs, the loc-encoded NRPS was 
designated a type D NRPS (Fig. 6) (25-28). 
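
The order of domain use in the proposed model can be written out as a short simulation. This is only an illustrative sketch of the description above: the tuple layout and the function names `assemble` and `domain_usage` are assumptions, residues are labeled D- only where the text states that epimerization occurs, and the fatty acid loading and TE-catalyzed cyclization steps are not modeled.

```python
from collections import Counter

# (adenylation domain, amino acid activated, epimerized by the downstream E domain?)
# Order of domain use as described for the proposed model (Fig. 6).
ASSEMBLY_ORDER = [
    ("A1", "Thr", False),
    ("A2", "Gln", True),   # first pass through LocB: Gln is epimerized
    ("A3", "Asp", False),
    ("A4", "Gly", False),
    ("A2", "Asn", False),  # second pass: A2 activates Asn, E domain skipped
    ("A3", "Asp", False),
    ("A4", "Gly", False),
    ("A5", "Tyr", False),
    ("A6", "Val", False),
]


def assemble(order):
    """Return the linear peptide as a list of residue labels."""
    return [("D-" if epimerized else "") + aa for _, aa, epimerized in order]


def domain_usage(order):
    """Count how many times each adenylation domain is used."""
    return Counter(domain for domain, _, _ in order)


if __name__ == "__main__":
    print("-".join(assemble(ASSEMBLY_ORDER)))
    print(dict(domain_usage(ASSEMBLY_ORDER)))
```

Six adenylation domains thus account for nine residues: A1, A5, and A6 each appear once in the usage count, whereas A2, A3, and A4 each appear twice.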

A typical lipopeptide biosynthesis pathway involves the KAS 
domain catalyzing the formation of a β-ketoacyl thioester, which 
is subsequently converted into a β-substituted fatty acid by a 
transamination or transhydroxylation reaction. The fatty acid is 
then transferred to a T domain and coupled to the first 
activated amino acid. In contrast, in the synthesis of 
locillomycin, only the ACS domain, not the KAS domain, is 
responsible for the activation of the fatty acid. The skipping of the 
KAS domain may be due to the lack of any functional domains, 
such as an aminotransferase or hydroxyl transferase, downstream 
of the KAS domain. The functions of the redundant T and C 
domains before LocA are unclear, since one condensation domain 
preceding the first A domain of the peptide synthetase is enough to 
couple the fatty acid to the first activated amino acid (2). While a 
few of these atypical biochemical features of single-module or 
multiple-module iteration have been observed previously, the 
locillomycins represent a rare, nonlinear NRPS biosynthetic 
model with a combination of iteration, skipping, alternation, and 
specificity changes (Fig. 6). 

Biological functions of locillomycins. Biological activity as- 
says revealed that locillomycins have moderate antibacterial 
activities (see Fig. S11 in the supplemental material). The antibac- 
terial MICs of locillomycins A, B, and C were 24.3, 18.8, and 17.6 
µg/ml, respectively, for methicillin-resistant Staphylococcus 
aureus (MRSA) and were 6.3, 5.8, and 5.4 µg/ml, respectively, for 
Xanthomonas oryzae pv. oryzae. Antiviral results for different 
locillomycin concentrations revealed that porcine epidemic diar- 
rhea virus (PEDV) infection could be inhibited effectively by these 
compounds. In particular, at a concentration of 10 µg/ml, 
locillomycins reduced the number of virus copies 300-fold and were 
about 10 times as efficient as surfactins (see Fig. S12). 
Locillomycins also have lower hemolytic activities than surfactins and are 
presumed to have reduced toxicity against eukaryotic cells, which could 
enhance their potential for use in therapeutic applications (see 
Fig. S12) (3). 